Evaluating new techniques on realistic datasets plays a crucial role in the development of ML research and its broader adoption by practitioners. In recent years, there has been a significant increase in publicly available unstructured data resources for computer vision and NLP tasks. However, tabular data -- which is prevalent in many high-stakes domains -- has been lagging behind. To bridge this gap, we present Bank Account Fraud (BAF), the first publicly available privacy-preserving, large-scale, realistic suite of tabular datasets. The suite was generated by applying state-of-the-art tabular data generation techniques to an anonymized, real-world bank account opening fraud detection dataset. This setting carries a set of challenges that are commonplace in real-world applications, including temporal dynamics and significant class imbalance. Additionally, to allow practitioners to stress test both performance and fairness of ML methods, each dataset variant of BAF contains specific types of data bias. With this resource, we aim to provide the research community with a more realistic, complete, and robust test bed to evaluate novel and existing methods.
translated by 谷歌翻译
Teaser: How seemingly trivial experiment design choices to simplify the evaluation of human-ML systems can yield misleading results.
Ithaca is a Fuzzy Logic (FL) plugin for developing artificial intelligence systems within the Unity game engine. Its goal is to provide an intuitive and natural way to build advanced artificial intelligence systems, making the implementation of such systems faster and more affordable. The software is made up of a C# framework and an Application Programming Interface (API) for writing inference systems, as well as a set of tools for graphic development and debugging. Additionally, a Fuzzy Control Language (FCL) parser is provided in order to import systems previously defined using this standard.
Many natural-language applications involve text generated by humans or by machines. While machines support humans in many of these applications, in a few others (e.g., adversarial machine learning, social bots, and trolls) machines try to impersonate humans. In this scope, we proposed and evaluated several mutation-based text generation approaches. Unlike machine-generated text, mutation-based generated text needs human text samples as inputs. We showed examples of mutation operators, but this work can be extended in many directions, such as proposing new text-based mutation operators tailored to the nature of the application.
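To make the idea of mutation-based generation concrete, here is a minimal sketch in Python. The abstract does not name its mutation operators, so the three word-level operators below (`swap_adjacent`, `delete_word`, `duplicate_word`) and the `mutate` helper are hypothetical illustrations, not the authors' operators. The only property taken from the abstract is that generation starts from a human-written sample.

```python
import random


def swap_adjacent(words, i):
    """Hypothetical operator: swap the words at positions i and i + 1."""
    out = list(words)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def delete_word(words, i):
    """Hypothetical operator: drop the word at position i."""
    return words[:i] + words[i + 1:]


def duplicate_word(words, i):
    """Hypothetical operator: repeat the word at position i."""
    return words[:i + 1] + words[i:]


# Assumed operator set; a real application would pick operators to suit its domain.
OPERATORS = (swap_adjacent, delete_word, duplicate_word)


def mutate(text, n_mutations=1, seed=None):
    """Apply randomly chosen mutation operators to a human-written sample."""
    rng = random.Random(seed)
    words = text.split()
    for _ in range(n_mutations):
        if len(words) < 2:
            break  # too short to mutate safely with these operators
        op = rng.choice(OPERATORS)
        i = rng.randrange(len(words) - 1)  # i + 1 stays in range for the swap
        words = op(words, i)
    return " ".join(words)


if __name__ == "__main__":
    print(mutate("the quick brown fox jumps", n_mutations=2, seed=0))
```

Passing a fixed `seed` makes a mutated sample reproducible, which helps when comparing operators against each other.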
Computational pathology can help save human lives, but models are annotation hungry and pathology images are notoriously expensive to annotate. Self-supervised learning (SSL) has been shown to be an effective method for utilizing unlabeled data, and its application to pathology could greatly benefit downstream tasks. Yet, there are no principled studies that compare SSL methods and discuss how to adapt them for pathology. To address this need, we execute the largest-scale study of SSL pre-training on pathology image data to date. Our study is conducted using 4 representative SSL methods on diverse downstream tasks. We establish that large-scale domain-aligned pre-training in pathology consistently outperforms ImageNet pre-training in standard SSL settings such as linear and fine-tuning evaluations, as well as in low-label regimes. Moreover, we propose a set of domain-specific techniques that we experimentally show lead to a performance boost. Lastly, for the first time, we apply SSL to the challenging task of nuclei instance segmentation and show large and consistent performance improvements under diverse settings.
Equivariance of neural networks to transformations helps to improve their performance and reduce generalization error in computer vision tasks, as they apply to datasets presenting symmetries (e.g. scalings, rotations, translations). The method of moving frames is a classical approach for deriving operators invariant to the action of a Lie group on a manifold. Recently, a rotation and translation equivariant neural network for image data was proposed based on the moving frames approach. In this paper we significantly improve that approach by reducing the computation of moving frames to only one, at the input stage, instead of repeated computations at each layer. The equivariance of the resulting architecture is proved theoretically, and we build a rotation and translation equivariant neural network to process volumes, i.e. signals on 3D space. Our trained model outperforms the benchmarks in medical volume classification on most of the tested datasets from MedMNIST3D.
Specular microscopy assessment of the human corneal endothelium (CE) in Fuchs' dystrophy is challenging due to the presence of dark image regions called guttae. This paper proposes a UNet-based segmentation approach that requires minimal post-processing and achieves reliable CE morphometric assessment and guttae identification across all degrees of Fuchs' dystrophy. We cast the segmentation problem as a regression task of the cell and gutta signed distance maps instead of a pixel-level classification task, as is typically done with UNets. Compared to the conventional UNet classification approach, the distance-map regression approach converges faster in clinically relevant parameters. It also produces morphometric parameters that agree with the manually segmented ground-truth data, with an average cell density difference of -41.9 cells/mm² (95% confidence interval (CI) [-306.2, 222.5]) and an average difference of mean cell area of 14.8 µm² (95% CI [-41.9, 71.5]). These results suggest a promising alternative for CE assessment.
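As an illustration of the regression target described above, the following minimal sketch computes a signed distance map from a small binary mask and recovers the mask by thresholding at zero. It is a brute-force, pure-Python example for tiny grids, not the paper's pipeline. The sign convention (positive inside the object, negative outside) and the function names `signed_distance_map` and `mask_from_sdm` are assumptions made here for illustration.

```python
import math


def signed_distance_map(mask):
    """Brute-force signed Euclidean distance map of a binary mask.

    Assumed convention: foreground pixels get +distance to the nearest
    background pixel; background pixels get -distance to the nearest
    foreground pixel.
    """
    rows, cols = len(mask), len(mask[0])
    fg = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    bg = [(r, c) for r in range(rows) for c in range(cols) if not mask[r][c]]
    if not fg or not bg:
        raise ValueError("mask must contain both foreground and background")

    def nearest(p, points):
        return min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in points)

    sdm = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                sdm[r][c] = nearest((r, c), bg)
            else:
                sdm[r][c] = -nearest((r, c), fg)
    return sdm


def mask_from_sdm(sdm):
    """Recover a binary mask from a (predicted) signed distance map."""
    return [[1 if v > 0 else 0 for v in row] for row in sdm]


if __name__ == "__main__":
    # A 3x3 "cell" centered in a 5x5 image.
    cell = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
            for r in range(5)]
    for row in signed_distance_map(cell):
        print(" ".join(f"{v:5.2f}" for v in row))
```

A network trained to regress such a map produces a continuous output, and a single threshold at zero turns it back into a segmentation, which is one plausible reason little post-processing is needed.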
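Although communication delays can disrupt multiagent systems, most existing multiagent trajectory planners lack a strategy to address this issue. State-of-the-art approaches typically assume a perfect communication environment, which is hardly realistic in real-world experiments. This paper presents Robust MADER (RMADER), a decentralized and asynchronous multiagent trajectory planner that can handle communication delays among agents. By broadcasting both the newly optimized trajectory and the committed trajectory, and by performing a delay check step, RMADER is able to guarantee safety even under communication delays. RMADER was validated through extensive simulation and hardware flight experiments and achieved a 100% success rate of collision-free trajectory generation, outperforming state-of-the-art approaches.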
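Previous work has shown that Deep-RL can be applied to mapless navigation, including the medium transition of Hybrid Unmanned Aerial Underwater Vehicles (HUAUVs). This paper presents new approaches based on state-of-the-art actor-critic algorithms to address the navigation and medium transition problems for a HUAUV. We show that a double-critic Deep-RL with recurrent neural networks improves the navigation performance of HUAUVs using only range data and relative localization. Our Deep-RL approaches achieve better navigation and transition capabilities, with solid generalization of learning across distinct simulated scenarios, outperforming previous approaches.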
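Deterministic and stochastic techniques in Deep Reinforcement Learning (Deep-RL) have become a promising solution for improving motion control and decision-making tasks for a wide variety of robots. Previous work has shown that these Deep-RL algorithms can be applied to mapless navigation of mobile robots in general. However, they tend to use simple sensing strategies, since they have been shown to perform poorly with high-dimensional state spaces, such as those yielded by image-based sensing. This paper presents a comparative analysis of two Deep-RL techniques, Deep Deterministic Policy Gradients (DDPG) and Soft Actor-Critic (SAC), when performing mapless navigation tasks for mobile robots. We aim to contribute by showing how the neural network architecture influences the learning itself, presenting quantitative results based on the time and distance of navigation of aerial mobile robots for each approach. Overall, our analysis of six distinct architectures highlights that the stochastic approach (SAC) is better suited to deeper architectures, while the opposite holds for the deterministic approach (DDPG).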